






A STANDARD OF INTERPRETATION OF NUMERICAL GRADES



LeROY D. WELD
Coe College, Cedar Rapids, Iowa 



A great deal has been written regarding methods of standard- 
izing teachers in the grading of pupils, and various schemes are in 
use whereby teachers are constrained to exercise arbitrary rules 
of judgment in order to make the results of their grading fulfil 
certain theoretical conditions. 

We are told, for example, that, in some large American colleges 
and universities using forms of the so-called "Missouri system," 
it is expected of teachers that a specified percentage of their assigned 
grades shall be above 90, another specified percentage between 
90 and 80, etc. (or the equivalent of these figures in terms of letters), 
and that the distribution thus sought is approximately that of the 
familiar, symmetrical "probability" or "error" law.

It is not the purpose of this paper to discuss the merits of either 
numerical or literal grade assignments, or of the categories "excel- 
lent," "good," "fair," etc., as against any sort of grade scale. 
We shall start out with the simple fact that among the great masses 
of our school and college teachers and pupils, the one common 
language in which the scholarly attainments of pupils are expressed 
is a scalar one, which may as well be numerical as literal or otherwise 
arbitrary. If we, who live in the Middle West, read in a New York 
magazine that a certain man entered college with an average grade 
of 95 in his preparatory work, we know pretty well what that 
means; and so it is the country over. And it will probably be a 
long time before the people at large will be educated to any other, 
radically different mode of expression. 

The problem now presented is that of establishing a method 
whereby grades assigned by one teacher can be intelligently com- 
pared with those assigned by another, and all brought to a common 
standard. The writer does not believe that this can be accomplished
by forcing teachers to conform to a theoretical system, the
scientific basis of which they do not understand, and which ignores 
those human elements of sympathy and encouragement which 
make teaching the noble profession that it is. On the contrary, 
we propose that the teacher be let alone, left to exercise her own 
free mode of rating; we shall show how the grades of each teacher 
can be easily translated into terms of a common standard (and 
even into terms of the probability scale, if desired) without that 
teacher's knowing anything about it. Indeed it were better that 
she should not know her own peculiarities in this respect; for let 
her once be conscious that she is not quite normal in the matter of 
grading, and she will immediately begin, though perhaps without 
realizing it, to "doctor" her ratings and give constrained instead 
of natural judgments. 

Let us first examine the ordinary percentage scale of grading 
and the results of its use. There can be no doubt that if it were 
possible to estimate accurately what is called scholarship, or 
proficiency, and express it in units, it would be found to have in 
the long run the symmetrical distribution of the theoretical "error 
law," like shots to right and left of a target, or the statures of 
people above and below the average. 1 But the fact is that teachers 
do not, in grading, take the same attitude toward good students 
and poor ones. Almost without exception, they mark the poor 
students higher in proportion to their attainment than they do the 
good students, thus revealing either the element of sympathy 
already referred to, or some less worthy motive, as of passing along 
dull pupils in order to get rid of them. This tendency has been 
proved in two independent ways: (1) the testimony of the teachers 
themselves, many of whom have been questioned on this point 
and have almost invariably admitted being conscious of an inclina- 
tion to "shove along" the poor pupil and grade him higher than 
he deserves; and (2) the statistical evidence based upon a study 
of many thousands of grades assigned by both public-school and 
college teachers. This latter investigation has given some very 
interesting results, and is the basis of our proposed method of 
standardizing teachers' assigned grades. 

1 See Weld, Theory of Errors and Least Squares, chap. iv. 



414 THE SCHOOL REVIEW 

The grade lists used in this study were obtained in part from 
college records and in larger measure from the records of the 
public schools of Cedar Rapids, kindly furnished by Superintendent 
J. J. McConnell for the purpose, over one hundred thousand 
individual grades being tabulated from lists assigned by about 
one hundred and fifty teachers over a period of several years. The 
public-school grades were numerical, each grade being assigned to 
the nearest multiple of five; for example, the grades 73, 74, 75, 76, 
77 were all called 75, while 78, 79, 80, 81, 82 were called 80, etc. 
The college grades were literal, each having, however, a well- 
understood approximate numerical significance. The work of 
tabulation was carried out by Mr. Leslie L. Fishwild in 1915, his 
summarized results being as follows: 

Out of over 100,000 grades, practically none were below 50. 



     1 per cent were 50        13 per cent were 80
     1 per cent were 55        13 per cent were 85
     2 per cent were 60        25 per cent were 90
     2 per cent were 65        23 per cent were 95
     5 per cent were 70         9 per cent were 100
     6 per cent were 75

These results are shown graphically in Fig. 1, the unsymmetrical 
character of which is unmistakable evidence of the tendency to 
crowd poor students up the scale. Fig. 2 shows the normal prob- 
ability distribution assumed to be ideal by the users of that system. 

It is interesting to study in this manner the grades assigned by 
individual teachers, as their personal characteristics in grading are 
brought out very distinctly in this way. Some show much greater 
crowding than this average, some much less; occasionally a teacher 
will show very erratic tendencies; and very rarely one is found 
whose grade distribution is approximately symmetrical as theory 
would demand. 

The writer has taken up the subject of this actual grade distri- 
bution as a mathematical problem, basing the theory upon certain 
very simple assumptions involving three separate personal charac- 
teristics of the individual teacher in grading, the result being a 
formula that agrees very closely with the statistical facts. It has 
been found in practice, however, that one of these characteristics, 



INTERPRETATION OF NUMERICAL GRADES 



415 



viz., the range to which practically all the teachers' assigned grades 
are confined, undergoes little variation and can therefore be 
assumed as constant for all teachers, which leaves but two personal 
characteristics to be determined in order to find the type of marker 
to which the individual teacher belongs; as a mathematician
would express it, only two parameters are necessary in the teachers'
grade-distribution formula. This mathematical work may be
published more appropriately elsewhere; but its outcome is the
simple and practical method now to be presented, whereby any
superintendent or principal or college registrar can determine the
teacher's type of grade assignment from a single semester's grades,
and be able thereafter to translate the grades given by that teacher
to a standard scale, which may be used whenever a student's actual
ranking is to be determined.

[Fig. 1: distribution of the grades actually assigned]

[Fig. 2: normal probability distribution]

For the purpose of this method the data are tabulated in a
more convenient form than the foregoing. Instead of finding the
percentage of a teacher's grades which are, say, 70 or 65, as in the
mathematical treatment referred to, we find the percentage which
are 70 or above, 65 or above, etc., and tabulate these values. It was
found that of the thousands of grades examined by Mr. Fishwild,
very approximately

   100 per cent were 50 or above        83 per cent were 80 or above
    99 per cent were 55 or above        70 per cent were 85 or above
    98 per cent were 60 or above        57 per cent were 90 or above
    96 per cent were 65 or above        32 per cent were 95 or above
    94 per cent were 70 or above         9 per cent were 100
    89 per cent were 75 or above

The passing grade being 75, 11 per cent of the grades denote failure.
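The "or above" figures follow from the per-grade percentages by a running sum taken from the top of the scale downward. The following is a minimal sketch of that arithmetic in Python; the function and variable names are illustrative choices and not part of the original study.

```python
from itertools import accumulate

# Percentage of all grades recorded at each multiple of five,
# as given in the summary of Mr. Fishwild's tabulation.
PERCENT_AT = {50: 1, 55: 1, 60: 2, 65: 2, 70: 5, 75: 6,
              80: 13, 85: 13, 90: 25, 95: 23, 100: 9}


def percent_at_or_above(percent_at):
    """Running sum of the percentages, starting from the highest grade."""
    grades = sorted(percent_at, reverse=True)
    totals = accumulate(percent_at[g] for g in grades)
    return dict(zip(grades, totals))


CUMULATIVE = percent_at_or_above(PERCENT_AT)

# Grades below the passing mark are those not counted at or above it.
PASSING_GRADE = 75
percent_failing = 100 - CUMULATIVE[PASSING_GRADE]

for grade in sorted(CUMULATIVE):
    print(f"{CUMULATIVE[grade]:3d} per cent were {grade} or above")
print(f"{percent_failing} per cent fall below the passing grade of {PASSING_GRADE}")
```

The same routine could be applied to a single teacher's grade list to obtain the curve for that teacher.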

While it would, of course, be desirable to make a selection from 
various localities, it is fair to presume that this distribution is not 
far from normal the country over, since the teachers assigning 
these grades were not by any means all of local origin. At any 
rate this distribution, even if not quite the average for the United 
States, will serve our purpose as a reference point and means of 
comparison. It is shown graphically by the heavy curve in Fig. 3, 
along with the corresponding curves for certain individual teachers, 
which are dotted. 

Now the foundation principle of our method is that we may 
expect any one large group of unselected pupils (as those handled by 
one teacher in a year) to have about the same actual scholarship, on 
the average, as any other similar large group. This means that 
radical differences observed in the grade distribution of one teacher 
from that of another have their origin in the characteristic grading 
methods of the teachers themselves rather than in the pupils they 
handle. (If in any case there is reason to believe otherwise, due 
allowance should of course be made for the fact in applying the 
method.) 

[Fig. 3: percentage of grades at or above each grade, from 50 to 100; standard distribution in heavy line, individual teachers dotted]

We may now proceed to classify teachers into types, according
to their peculiar characteristics in grading. This may be done as
minutely and over as large a range as we think best, but the writer
suggests the use of twenty types, consecutively numbered, of
which the middle ones are nearest to the normal, or standard. It
is believed that this number will suffice in practice. These types
are identified, as before mentioned, by means of two simple
characteristics, for which I have selected (A) the percentage of the
teacher's assigned grades that are 70 or above, and (B) the per- 
centage that are 90 or above. (The former, A, is more important 
in identifying the type than the latter, B.) 
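The two characteristics can be computed directly from a list of a teacher's assigned grades. The sketch below assumes the grades are plain numbers on the percentage scale; the function name and the sample list are hypothetical and serve only as illustration.

```python
def characteristics(grades):
    """Return (A, B): the percentage of grades that are 70 or above,
    and the percentage that are 90 or above."""
    if not grades:
        raise ValueError("at least one grade is required")
    n = len(grades)
    a = 100 * sum(g >= 70 for g in grades) / n
    b = 100 * sum(g >= 90 for g in grades) / n
    return a, b


# Hypothetical grade list for one teacher (illustrative only, not from the study).
sample = [95, 90, 85, 70, 65, 100, 90, 80, 75, 60]
A, B = characteristics(sample)
print(f"A = {A:.0f} per cent, B = {B:.0f} per cent")
```

In practice a much longer list would be wanted than the ten grades shown here, since the percentages from a small sample are unstable.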

For the average or standard distribution that we are using, A 
is 94 per cent and B is 56 per cent, which, by the way, is rather 
surprising when we think of it. It is an actual fact, however, that
56 per cent of all the grades examined by Mr. Fishwild were 90
or above, which exhibits more strikingly than ever the general 
tendency to crowd up the scale. 

For individual teachers, A and B will have different sets of 
values, and, after considerable study of actual distributions, the 
twenty types shown in Table I (at end of paper) have been selected 
as fairly representative of the range likely to be encountered. In 
general the first types correspond to the consistently low markers 
and the last to the consistently high markers, while the middle 
type (11) is about normal. Provision has been made, also, for 
certain types that for some reason are high markers of good pupils 
and low markers of poor pupils, or vice versa; it is certain that such 
types exist. In practice a judicious combination of two types 
may sometimes be found satisfactory with gradual transition along 
the scale from one type to the other. 

The next step in the development of the method was to compare
the grade distributions corresponding to the respective types with 
the standard distribution. Familiarity with the form of the 
distribution curve (Fig. 3), through plotting many individual grade 
distributions, made it possible to trace curves with fair accuracy 
when only the two characteristic points at 70 and 90 were given. 
A number of such curves are shown dotted in Fig. 3. The 
grade comparison then became a simple matter. For example, it 
was found from the curve for Type 16 that grades of 80 or above 
are given by teachers of this type in about 92 per cent of their 
gradings, while we observe that this same percentage of the gradings 
of the standard marker (Type 11) are 73 or above. We may 
therefore conclude that the grade 80 given by a teacher of Type 16 
corresponds to the grade 73 on the standard scale of marking. In 
a similar manner, if a teacher of Type 6 gives a pupil the grade 80, 
it is equivalent to 83 on the standard scale, this teacher being a 
low marker. 
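The comparison just described amounts to matching equal percentages on the two curves. A rough sketch of the second half of that step is given below: given the percentage of a teacher's grades that lie at or above some assigned grade, it finds the grade on the standard scale at or above which the same percentage lies. It assumes straight-line interpolation between the tabulated points of the standard distribution, whereas the values in the text were read from a hand-traced curve, so small differences are to be expected (about 72 here against the 73 quoted for Type 16). The function name is hypothetical.

```python
# Standard distribution: (grade, percentage of grades at or above that grade).
STANDARD = [(50, 100), (55, 99), (60, 98), (65, 96), (70, 94), (75, 89),
            (80, 83), (85, 70), (90, 57), (95, 32), (100, 9)]


def standard_grade_for(percent_at_or_above):
    """Grade on the standard scale at or above which the given
    percentage of grades falls (linear interpolation, an assumption)."""
    p = percent_at_or_above
    if not STANDARD[-1][1] <= p <= STANDARD[0][1]:
        raise ValueError("percentage lies outside the tabulated range")
    for (g_lo, c_lo), (g_hi, c_hi) in zip(STANDARD, STANDARD[1:]):
        if c_hi <= p <= c_lo:
            if c_lo == c_hi:
                return float(g_lo)
            # Move from the lower grade toward the higher in proportion
            # to how far p has dropped below the lower grade's percentage.
            return g_lo + (g_hi - g_lo) * (c_lo - p) / (c_lo - c_hi)


# A Type 16 teacher gives 80 or above in about 92 per cent of gradings.
print(standard_grade_for(92))
```

A smooth curve through the points, as used in the paper, would bend slightly away from these straight segments, which accounts for the difference of about one point.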

Proceeding in this manner, it has been found possible to con- 
struct an approximate table (Table II), whereby such translations 
may be made at a glance as soon as the type to which the teacher 
belongs has been decided upon. Grades intermediate between 
those provided for in the table can be easily interpolated. 
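Interpolation for intermediate grades may be sketched as follows. The row used is that of Type 16 as read from the scanned Table II, and should be treated as an assumed transcription; only its entry for 80 (standard 73) is confirmed by the text. The function name is hypothetical, and straight-line interpolation between neighboring columns is an assumption.

```python
# Column headings of Table II: grades assigned by the teacher.
ASSIGNED = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Type 16 row as read from Table II (assumed transcription of the scan).
TYPE_16 = [50, 55, 57, 59, 64, 69, 73, 79, 85, 92, 99]


def to_standard(assigned, row, scale=ASSIGNED):
    """Translate an assigned grade to the standard scale, interpolating
    linearly between the two nearest tabulated columns."""
    if not scale[0] <= assigned <= scale[-1]:
        raise ValueError("grade lies outside the range of the table")
    for i in range(len(scale) - 1):
        lo, hi = scale[i], scale[i + 1]
        if lo <= assigned <= hi:
            fraction = (assigned - lo) / (hi - lo)
            return row[i] + fraction * (row[i + 1] - row[i])


print(to_standard(80, TYPE_16))  # a tabulated column
print(to_standard(82, TYPE_16))  # between the columns for 80 and 85
```

Rounding the result to the nearest whole number would normally be sufficient, given the approximate character of the table.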




The possession of Tables I and II, with a semester or so of grade 
reports from the school system, will be sufficient to enable any 
school superintendent or college administrator to accomplish the 
desired standardization and comparison. The method of procedure 
may be summarized, for practical use, in the form of the following 
directions: 

1. To standardize a teacher's grading, examine a considerable number of 
that teacher's grades (the more the better), ascertaining (A) what percentage 
of them are 70 or above, and (B) what percentage are 90 or above. 

2. Find from Table I the type that corresponds most nearly with these 
characteristics, A and B, and assign the teacher to that type (a judicious 
combination of types may prove satisfactory). A should have more influence 
than B in selecting the type. 

3. The standard grade, which corresponds to any one of the teacher's 
assigned grades appearing in the top row of Table II, is given below it in the 
same column, opposite the teacher's type number. 

Example: It is found that 92 per cent of the grades assigned 
by Miss M. are 70 or above, and 45 per cent are 90 or above. 
Referring to Table I, it is seen that Miss M. belongs in the neigh- 
borhood of Type 8. She is a low marker. The horizontal row 
opposite Type 8 in Table II now shows that Miss M.'s 50 is 
equivalent to a standard 54, her 55 to a standard 59, etc. 
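The selection of a type in the example of Miss M. can be sketched as a nearest-match search over Table I. Several points are assumptions: the (A, B) pairs are as read from the scanned table, Type 1 is left out because its B entry is not legible there, and the weight of 4 given to A is a hypothetical figure standing in for the direction that A "should have more influence than B." The two equivalents for Type 8 are those quoted in the example.

```python
# (A, B) for each type as read from Table I (assumed transcription).
# Type 1 is omitted: its B entry is not legible in the scanned table.
TABLE_I = {
    2: (86, 25), 3: (87, 35), 4: (88, 30), 5: (89, 40), 6: (90, 45),
    7: (91, 65), 8: (92, 50), 9: (93, 55), 10: (93, 60), 11: (94, 55),
    12: (95, 30), 13: (95, 45), 14: (96, 60), 15: (97, 40), 16: (97, 70),
    17: (98, 30), 18: (98, 35), 19: (99, 45), 20: (99, 50),
}

# Hypothetical weight: the paper says only that A should count for more than B.
A_WEIGHT = 4


def nearest_type(a, b, a_weight=A_WEIGHT):
    """Type whose (A, B) lies closest to the teacher's, with A weighted more."""
    def distance(type_number):
        type_a, type_b = TABLE_I[type_number]
        return a_weight * abs(type_a - a) + abs(type_b - b)
    return min(TABLE_I, key=distance)


# Miss M.: 92 per cent of grades are 70 or above, 45 per cent are 90 or above.
miss_m_type = nearest_type(92, 45)

# Equivalents quoted in the example for Type 8.
TYPE_8_EQUIVALENTS = {50: 54, 55: 59}

print(f"Miss M. is nearest Type {miss_m_type}")
for assigned, standard in TYPE_8_EQUIVALENTS.items():
    print(f"her {assigned} is equivalent to a standard {standard}")
```

With a smaller weight on A the search would favor Type 6 for these figures, which illustrates why the direction to let A dominate matters and why a combination of neighboring types is sometimes judicious.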

If a teacher is found whose characteristics are nowhere near 
being represented by any of the types, as may sometimes occur, it 
is evidence of some very erratic habit of grading, and more complete 
tabulation of the grades will be desirable. This is likely to reveal 
inconsistencies which can be explained only by an utter lack of 
system in grading or of appreciation of what grades mean; in any 
case that teacher will bear watching in this respect at least. Mr. 
Fishwild ran across two or three cases of this sort in his research. 
Such instances are, however, exceptional and need not interfere 
with the general application of the method. 

The value of such information as this method furnishes need 
hardly be enlarged upon. An illustration is found in the problem 
of selecting the honor students from the members of a graduating 
class who have had their instruction under different groups of 
teachers; or again, in the investigation of complaints as to a 






TABLE I
Types of Teachers

            A                B
Type   Percentage 70    Percentage 90
         or Above         or Above

  1         85               ..
  2         86               25
  3         87               35
  4         88               30
  5         89               40
  6         90               45
  7         91               65
  8         92               50
  9         93               55
 10         93               60
 11         94               55
 12         95               30
 13         95               45
 14         96               60
 15         97               40
 16         97               70
 17         98               30
 18         98               35
 19         99               45
 20         99               50






TABLE II
For the Translation of Grades to Standard Scale

Type of          Grade Assigned by Teacher of Given Type
Teacher    50   55   60   65   70   75   80   85   90   95  100

   1       55   65   70   74   78   83   87   92   97  100  100
   2       55   65   70   74   77   82   86   91   96   99  100
   3       60   65   68   73   77   82   85   90   94   ..  100
   4       52   58   66   72   76   81   86   91   95   ..  100
   5       60   65   68   72   75   80   84   ..   93   97  100
   6       59   64   67   72   75   80   83   ..   92   97  100
   7       55   60   65   70   74   77   80   83   87   94   99
   8       54   59   64   68   73   77   82   86   91   96  100
   9       54   59   62   67   71   76   81   85   90   95  100
  10       60   63   66   68   72   76   80   84   89   94   99
  11       50   55   60   65   70   75   80   85   90   95  100
  12       50   50   58   63   68   75   82   88   95  100  100
  13       50   55   58   63   68   74   79   85   93   98  100
  14       50   54   58   62   67   73   79   84   89   94  100
  15       50   55   56   59   64   72   80   87   93   98  100
  16       50   55   57   59   64   69   73   79   85   92   99
  17       50   53   55   58   61   70   78   87   95  100  100
  18       50   53   55   58   61   71   79   86   94   99  100
  19       50   53   56   57   58   64   73   82   92   98  100
  20       50   51   53   54   55   63   76   84   91   97  100




teacher's grading or of the suspicion that a teacher is being too 
easy, etc. 

It is further to be noted that the standards of grading in different 
schools may be compared in exactly the same manner as those of 
different teachers, the process being capable of considerably greater 
refinement because of the larger amount of data available. This 
might be made use of in the rating of high schools by college- 
entrance boards, so that, for example, a student coming from a 
certain high school with an average of 87 could be considered to 
have an average of 85 from a standard high school. 

It is the purpose of the writer, as time permits, to gather data 
from a wider field and by their use to improve Tables I and II, so 
that they may attain the greatest possible accuracy and applicabil- 
ity. Statistics of this kind, and suggestions of educators relative 
to this subject, will be appreciated. Meanwhile it is hoped that 
the method as presented, with the accompanying tables, will be 
found useful. 



